

NAVAL  POSTGRADUATE  SCHOOL 

Monterey,  California 


THESIS 


A  PERFORMANCE  EVALUATION  MODEL 

FOR  THE  STOCKPOINT  LOGISTICS 

INTEGRATED  COMMUNICATION  ENVIRONMENT 

(SPLICE) 


by 


Jonathan  B.  Schmidt 
September  1985 


Thesis  Advisor 


N.  F.  Schneidewind 


Approved  for  public  release;  distribution  is  unlimited 


T226832 


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)

1. REPORT NUMBER:
2. GOVT ACCESSION NO.:
3. RECIPIENT'S CATALOG NUMBER:
4. TITLE (and Subtitle): A Performance Evaluation Model for the Stockpoint Logistics Integrated Communication Environment (SPLICE)
5. TYPE OF REPORT & PERIOD COVERED: Master's Thesis, September 1985
6. PERFORMING ORG. REPORT NUMBER:
7. AUTHOR(s): Jonathan B. Schmidt
8. CONTRACT OR GRANT NUMBER(s):
9. PERFORMING ORGANIZATION NAME AND ADDRESS: Naval Postgraduate School, Monterey, California 93943-5100
10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
11. CONTROLLING OFFICE NAME AND ADDRESS: Naval Postgraduate School, Monterey, California 93943-5100
12. REPORT DATE: September 1985
13. NUMBER OF PAGES: 68
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
15. SECURITY CLASS. (of this report): UNCLASSIFIED
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16. DISTRIBUTION STATEMENT (of this Report): Approved for public release; distribution is unlimited
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18. SUPPLEMENTARY NOTES:
19. KEY WORDS (Continue on reverse side if necessary and identify by block number): SPLICE, local area network, computer performance evaluation, monitoring, performance studies, network management, performance measurement
20. ABSTRACT (Continue on reverse side if necessary and identify by block number):

This thesis investigates ways of improving the real-time
performance of the Stockpoint Logistics Integrated Communication
Environment (SPLICE). Performance evaluation through continuous
monitoring activities and performance studies are the principal
vehicles discussed. The method for implementing this performance
evaluation process is the measurement of predefined performance
indexes. Performance indexes for SPLICE are offered that would
measure these areas. Existing SPLICE capability to carry out
performance evaluation is explored, and recommendations are made
to enhance that capability.

Approved  for  public  release;  distribution  is  unlimited 


A  Performance  Evaluation  Model 
for  the  Stockpoint  Logistics 
Integrated  Communication  Environment  (SPLICE) 

by 

Jonathan  B.  Schmidt 
Lieutenant  Commander,  United  States  Navy 
B.A.,  University  of  Missouri,  1971 


Submitted  in  partial  fulfillment  of  the 
requirements  for  the  degree  of 


MASTER  OF  SCIENCE  IN  INFORMATION  SYSTEMS 

from  the 


NAVAL  POSTGRADUATE  SCHOOL 
September  1985 


ABSTRACT 

This  thesis  investigates  ways  of  improving  the  real-time 
performance  of  the  Stockpoint  Logistics  Integrated 
Communication  Environment  (SPLICE).   Performance  evaluation 
through  continuous  monitoring  activities  and  performance 
studies are the principal vehicles discussed. The method for
implementing  this  performance  evaluation  process  is  the 
measurement  of  predefined  performance  indexes.   Performance 
indexes  for  SPLICE  are  offered  that  would  measure  these  areas. 
Existing  SPLICE  capability  to  carry  out  performance  evaluation 
is  explored,  and  recommendations  are  made  to  enhance  that 
capability.


I. INTRODUCTION

Satisfactory  performance  is  an  objective  that  is  often 
unstated  in  the  day-to-day  operation  of  a  computer  system.   To 
gauge  the  fulfillment  of  this  objective,  operators  must  lay  a 
foundation  in  the  form  of  articulated  standards,  and  then 
employ  the  proper  tools  to  measure  these  standards  against 
current  performance.   It  is  our  intent  to  describe  a 
qualitative  model  of  a  computer  performance  evaluation  (CPE) 
system  that  will  facilitate  the  efficient  real-time  operation 
of  an  information  processing  system  composed  of  two  or  more 
host  computers  interconnected  via  a  local  area  computer 
network.   The  model  will  be  illustrated  through  application  to 
the  U.S.  Navy's  Stockpoint  Logistics  Integrated  Communication 
Environment  (SPLICE),  a  series  of  62  computer  systems 
currently  being  implemented  service-wide  to  provide  for 
interconnection  of  various  elements  of  the  Navy's  logistics 
community.

Concepts  of  SPLICE  network  management  were  proposed  in 
[Ref. 1] prior to selection of system hardware and software.
It  is  the  purpose  of  this  research  to  further  explore  certain 
of  those  concepts  that  are  pertinent  to  performance  evaluation 
with  the  objective  of  proposing  a  conceptual  model  of  a  CPE 
organization  that  may  be  implemented  to  realize  a  high  level 
of  SPLICE  system  performance. 


In  this  chapter  the  concept  of  local  area  network  (LAN) 
management  will  be  presented  as  a  logical  vehicle  through 
which  to  implement  such  a  performance  evaluation  effort.   LAN 
management  will  be  functionally  decomposed  and  followed  by  a 
discussion  of  computer  systems  performance  evaluation  and 
monitoring.   LAN  management  functions  will  be  identified  that 
are  fulfilled  by  the  evaluation  and  monitoring  process.   A 
brief  description  of  a  SPLICE  LAN  will  follow. 

Chapter  Two  will   provide  background  information  pertinent 
to  SPLICE  concepts  and  components.   Chapter  Three  will  discuss 
the  process  of  measurement   and  how  it  fulfills  performance 
evaluation.   Chapter  Four  will  see  the  development  of  a  model 
of  performance  evaluation  for  implementation.   In  Chapter  Five 
this  model  will  be  applied  to  a  SPLICE  LAN.   Chapter  Six  will 
present  a  summary  and  recommendations. 

A.   ASSUMPTIONS 

It  is  expected  that  the  reader  has  a  fundamental  knowledge 
of  information  processing  systems.   Principal  terms  relating 
to  performance  evaluation  and  communications  networks  are 
defined  in  Appendix  A. 

This  research  will  address  the  role  of  performance 
monitoring in up-and-running systems. There will be no
discussion  of  the  use  of  performance  evaluation  tools  in  the 
design  phase  of  system  development.   Similarly,  we  will  assume 
fully  implemented  SPLICE  sites  in  this  discussion,  although  at 
the time of this writing only partial capabilities have been
realized  at  a  handful  of  sites. 

We  will  consider  CPE  methodologies  appropriate  to  the 
characteristics  of  the  SPLICE  target  system,  e.g.  technologies 
pertinent to multiprogrammed hosts servicing interactive users
on  a  baseband,  bus  local  area  network.   Existing  UADPS-SP 
hosts,  such  as  the  Burroughs  Medium  Systems,  will  not  be 
directly  included  in  the  performance  evaluation  discussion. 

B.   THE  FUNCTIONS  OF  LAN  MANAGEMENT 

The  management  of  a  local  area  network  is  a  multi-faceted 
endeavor  that,  in  the  opinion  of  Stallings  [Ref.  2:p.  354]: 
".  .  .  encompasses  those  tasks,  human  and  automated,  that 
support  the  creation,  operation,  and  evolution  of  a  network." 
Stallings further specifies eight functional areas that comprise
network management: operations; administration; maintenance;
configuration management; documentation/training; data base
management; planning; and security. The following sections
will briefly describe each functional area. [Ref. 2:p. 319]

1.   Operations 

Everyday  operation  of  an  LAN  is  overseen  by  operations 
management.   Of  particular  concern  in  this  area  are  the 
monitoring  of  network  status  and  performance. 




2.  Administrative  Management 

Administration  deals  with  day-to-day  management 
concerns  other  than  those  on-line.   Examples  include  assigning 
user  passwords  and  billing  users. 

3.  Maintenance  Management 

Network  maintenance  involves  the  process  of  detecting, 
isolating,  and  correcting  problems.   Both  hardware  and 
software  degradations  are  included. 

4.  Configuration Management

This  area  describes  network  components,  and  controls 
the  changes  to  components  to  maintain  a  coherent  picture  of 
the  overall  network  structure. 

5.  Documentation/Training 

The  education  of  network  personnel  is  accomplished 
through  this  management  function.   It  includes  the  developing 
and  maintaining  of  documentation. 

6.  Data  Base  Management 

This  area  relates  to  the  capability  to  develop  and 
operate  a  data  base  for  overall  network  management. 

7.  Planning  Management 

The  planning  function  ensures  proper  sizing  and 
capacity  planning  for  the  existing  network  throughout  its  life 
cycle.   It  necessitates  an  on-going  requirements  analysis 
effort  to  ensure  network  planners  are  aware  of  the  system's 
needs.

8.  Security Management

The  security  function  ensures  network  integrity  by 
denying  access  to  all  but  authorized  users. 

C.   COMPUTER  PERFORMANCE  EVALUATION  OBJECTIVES 

A  conceptual  foundation  for  the  computer  performance 
evaluation  (CPE)  of  an  information-processing  system  is 
provided  by  Ferrari.   A  system's  performance  may  be 
objectively  measured,  calculated,  or  estimated  by  certain 
quantifiable  indexes.   These  indexes  may  be  given  different 
weights  by  different  people  involved  with  the  system,  or  even 
by  the  same  person  under  different  circumstances.   Ferrari 
puts  forth  four  objectives  for  performance  evaluation: 
procurement;  improvement;  capacity  planning;  and  design,  all 
of which are discussed below. [Ref. 3:pp. 1-35]

1.   Procurement

Evaluation  for  procurement  reasons  includes  steps 
taken  to  choose  the  most  suitable  system  from  a  range  of 
alternatives.   Examples  include  an  installation's  design,  or 
hardware/software configuration.

2.   Improvement

Performance  evaluation  is  commonly  performed  for  the 
purpose  of  improving  the  efficiency  of  an  existing  system. 
This  objective  will  prove  to  be  the  focus  of  our  research. 

3.   Capacity Planning

This category of evaluation involves sizing an
information-processing  system  and  its  components  to  meet  the 
needs  of  its  users.   It  is  a  dynamic  function  that  is 
necessary  not  only  during  system  design  but  throughout  the 
system's life-cycle as well.

4.   Design

Problems  faced  by  designers  during  the  creation  of  a 
new  system  belong  to  this  class. 

D.   CPE  TECHNIQUES 

Ferrari  describes  two  techniques  for  defining  performance 
indexes.  These  different  methods,  described  in  the  following 
sections,  are  measurement  and  modelling. 

1.   Modelling

In  a  model,  a  representation  of  the  computer  system  is 
constructed  portraying  system  performance  characteristics  as 
accurately  as  possible.   Modelling  techniques  are  able  to 
evaluate  performance  even  though  the  target  system  may  not  be 
available  or  even  in  existence.   Thus  models  are  invaluable  in 
certain  situations,  for  instance  during  the  design  phase.   At 
the  same  time,  there  may  always  be  some  level  of  doubt  about 
modelling results until they are validated in reality.

2.   Measurement

Empirical  performance  evaluations  are  made  directly 
from  the  target  system.   The  results  of  this  monitoring  must 
be taken from an existing, on-line system, although a test
workload  may  be  more  convenient  and  efficient  to  evaluate  than 
an  actual  workload.   The  advantages  of  measurement  relative  to 
modelling  are  increased  accuracy  and  credibility.   Measurement 
is  a  focus  of  this  research. 

E.   SPLICE  OVERVIEW 

The  U.S. Navy  has  developed  the  SPLICE  concept  to  provide 
support to its suite of Uniform Automated Data Processing
System  for  Stock  Point  (UADPS-SP)  sites.   SPLICE  provides  a 
minicomputer  interfaced  with  existing  UADPS-SP  machines  via  a 
local  area  network  for  the  purposes  of  absorbing  the 
communications  workload  and  providing  services  for  interactive 
processing,  front-end  processing,  Remote  Job  Entry  (RJE),  and 
terminal  concentrator  requirements.   SPLICE  will  also  provide 
a  standardized  telecommunications  interface  among  the  62 
logistics sites.

The  standard  SPLICE  minicomputer  is  the  Tandem  NonStop  TXP 
with  between  two  and  16  central  processing  units,  sized  for 
the  requirements  of  each  individual  site.   The  local  area 
network  selected  for  implementation  is  Network  Systems 
Corporation's HYPERchannel, a baseband, CSMA-CD, co-axial
medium  with  the  capability  of  passing  up  to  50  megabits  per 
second  of  data  along  each  of  two  trunks.   SPLICE  sites  will  be 
interconnected  through  the  Defense  Data  Network  to  realize 
vertical and horizontal integration among different elements
of  the  Navy's  logistics  infrastructure. 

Installation  of  SPLICE  sites  is  underway  in  accordance 
with  a  schedule  that  calls  for  a  phased  implementation  of 
SPLICE  capabilities.   A  more  thorough  discussion  of  SPLICE 
concepts and design features may be found in [Ref. 4] and
[Ref. 5].

F.   RESEARCH  FOCUS 

In  discussing  system  performance  evaluation,  it  is  helpful 
to  distinguish  between  performance  evaluation  studies  and 
continuous  monitoring  activities.   Where  the  former  is 
characterized  by  an  effort  of  short  time  duration  and  is  often 
performed  as  a  result  of  specific  problem  symptoms,  the  latter 
may  be  carried  out  over  a  significant  portion  of  a  system's 
life  cycle  and  is  intended  to  identify  problems  as  they  arise. 
Both  of  these  activities  are  of  interest  and  will  be  addressed 
as  viable  methods  of  improving  SPLICE  network  performance. 

We  have  selected  improvement  of  network  efficiency  as  the 
CPE  objective  appropriate  to  this  study.   Since  the 
environment  in  this  research  involves  performance  evaluation 
of  a  local  area  network,  we  consider  it  appropriate  to  frame 
the  overall  discussion  in  terms  of  the  improvement  in  network 
performance  that  may  be  realized  through  the  CPE  technique  of 
measurement of performance indexes. When the objective of
improved performance is attained, it will fulfill certain LAN
management  functions. 

Specifically,  operations  management  may  be  carried  out 
through the CPE processes discussed above [Ref. 2:p. 320].
Similarly,  that  portion  of  maintenance  management  involved 
with  the  detection  of  system  failures  may  be  carried  out 
through  performance  monitoring.   Planning  activities  also  may 
be  based  on  accumulated  statistics  gathered  through 
performance  evaluation.   However,  planning  is  accomplished 
best  through  the  CPE  objective  of  capacity  planning  and 
therefore  will  not  be  considered  in  this  study.   This  research 
will  explore  ways  in  which  operations  management  and  that 
portion  of  maintenance  management  centering  on  detection  of 
component  failure  may  be  fulfilled  through  the  performance 
evaluation  process. 




II.   SPLICE  BACKGROUND 

The  purpose  of  this  chapter  is  to  establish  the 
characteristics  of  the  Stockpoint  Logistics  Integrated 
Communication  Environment  that  are  pertinent  to  the 
application  of  performance  evaluation.   The  principal 
subsystems  of  the  network  will  be  described  along  with  the 
hardware  and  software  components  that  implement  them. 

Currently  the  computer  hardware  implemented  in  UADPS  stock 
points  is  composed  of  Burroughs  Medium  Systems 
(B-3500/3700/4700/4800), various Perkin-Elmer suites, and
terminal  concentrator  devices  such  as  the  B867,  CP9400,  and 
CP9500.   Many  of  these  systems  are  approaching  obsolescence, 
and  others  are  experiencing  capacity  problems.   At  the  same 
time,  many  new  applications  are  being  developed  which  utilize 
interactive  processing  features  that  are  awkwardly  supported 
by  these  existing  equipments. 

SPLICE  systems  are  being  implemented  to  replace  the 
present  Burroughs-dependent  communications  processors  and 
provide  an  on-line  processing  capability  for  UADPS-SP.    There 
are  three  principal  SPLICE  objectives.   One  is  to  consolidate 
local  and  long-haul  communications  into  a  single  integrated 
network,  utilizing  the  Defense  Data  Network  (DDN).   Another  is 
to  relieve  the  currently  saturated  UADPS-SP  processors  by 
providing  interactive  transaction  processing  and  distributed 
processing among SPLICE nodes. The third is to provide a
measure of system integration through standardization of
hardware [Ref. 6].

The  primary  thrust  of  SPLICE,  therefore,  is  to  provide  a 
base  for  communications  support  for  the  Navy  logistics 
community.   The  processing  power  of  SPLICE  hardware  also 
permits  UADPS-SP  applications  to  run  on  the  Tandem  systems, 
further  relieving  the  tasking  on  existing  Burroughs  machines. 
One  such  application  allows  for  replication  of  a  number  of 
UADPS-SP  files  onto  SPLICE  hardware.   As  the  master  Burroughs 
files  are  updated,  the  replicated  files  are  also  updated. 
Thus  queries  may  be  placed  against  the  replicated  files, 
reducing  the  traffic  that  must  pass  through  to  the  Burroughs 
Medium  System  host.   Another  program  is  Transaction  Ledger  on 
Disk  (TLOD)  which  maintains  historical  transaction  data  that 
allows  researchers  to  reconcile  inventory  levels  and  improve 
inventory  accuracy. 

A.   SPLICE  SUBSYSTEMS 

The  brief  overview  of  SPLICE  from  the  first  chapter  set 
forth  the  computing  areas  in  which  the  SPLICE  concept  would 
specifically lend support to the current UADPS-SP environment:
telecommunications;  interactive  processing;  front-end 
processing;  remote  job  entry;  and  terminal  concentrator 
requirements.   This  support  will  be  implemented  through  a 
standardized  SPLICE  minicomputer,  the  Tandem  NonStop  TXP, 
interfaced with UADPS-SP processors via a local area network,
NSC's HYPERchannel. Functional subsystems of SPLICE are
described in the following sections [Ref. 4:pp. 3-6 to 3-11].
These  subsystems  are  executed  through  the  SPLICE  minicomputer, 
the  Tandem  NonStop  TXP.   It  should  be  noted  as  these 
subsystems  are  reviewed  that  only  a  portion  of  SPLICE  traffic 
will  be  seen  by  the  local  area  network.   For  instance,  queries 
against  the  activity's  data  base  will  be  processed  between  a 
user  terminal  and  replicated  files  accessed  directly  by  the 
Tandem  machine.   Therefore  it  is  necessary  to  measure  indexes 
external  to  the  local  area  network  to  ensure  that  SPLICE 
system  performance  is  accurately  portrayed. 

1.  Terminal  Management  Subsystem  (TM) 

The  TM  will  support  the  requirements  for  terminal 
handling,  security  access,  and  user  process  selection. 
Terminals  that  are  interfaced  through  TM  include  Burroughs 
TD-834/832  and  MT  983,  Teletype  Data  Speed  40/1  and  40/2, 
Zentec,  Ramtec  Omron  8030  B/8040  and  Rantec  8025  AG,  IBM 
2780/3780/3270,  and  various  others. 

2.  Transaction Support Processing System (TSP)

Transaction processing services that are supported by
SPLICE are made available to users through this subsystem.
The eight TSP components are:

-Exception  Processing 

-User  ID  Message  Access 

-Data  Communication  Network  (DCN)  Access 

-Transaction Processing System (TPS) for stand-alone,
interactive  processing 

-UADPS-SP  Frame  Manager 

-Host Unique Pre-Processing for BMS and other host interface
functions  such  as  password  validation 

3.  Site Management Subsystem (SMS)

The  SMS  acts  to  provide  access  to  the  SPLICE  system  to 
CRT  users,  the  console  operator,  and  the  System  Administrator 
through  the  Tandem  minicomputer  command  interpreter. 

4.  Internal Management Subsystem (IM)

This  subsystem  controls  routing  of  files  and  data 
among  the  SPLICE  system  destinations:  the  Data  Communication 
Network,  the  Local  Computer  Network,  and  system  terminals.   It 
also  provides  for  monitoring  of  the  system  through  a  component 
called  the  Environment  Manager. 

5.  Data  Exchange  Subsystem  (DE) 

DE  controls  the  transfer  of  data  set  files  entering 
and  leaving  the  SPLICE  system. 

6.  Site Data Communications Network Control Subsystem

The Site DCN Control Subsystem exercises control over
the SPLICE system's access to external networks. Tandem
off-the-shelf software products EXPAND and TRANSFER provide
this communications interface with satellite and other complex
sites.

7.  Complex Local Computer Network Control Subsystem

This subsystem is comprised of the elements that
provide  the  physical  and  logical  connection  between  the 
HYPERchannel  local  area  network  and  the  various  hosts  in  the 
SPLICE  system. 

B.   THE  TANDEM  NONSTOP  TXP  MINICOMPUTER 

Computer  systems  manufactured  by  Tandem  Computers,  Inc. 
are  characterized  by  a  multiplicity  of  processors, 
controllers,  data  paths  between  system  modules,  and  power 
supplies.   There  are  at  least  two  of  each  of  these  components 
in  every  Tandem  system.   Thus  it  is  likely  that  some 
processing  will  be  possible  even  in  the  presence  of 
casualties.   In  normal  usage,  all  resources  may  be  employed  by 
the  workload.   When  a  casualty  occurs,  the  workload  from  the 
failed  component  is  automatically  assumed  by  another  unit. 
Thus  Tandem  systems  are  often  employed  in  environments  where 
continuous  availability  is  of  great  operational  importance. 
The  modularity  of  the  Tandem  architecture  allows  failed 
components  to  be  repaired  without  the  necessity  of  powering 
down  the  entire  system.   And  it  is  relatively  easy  to  add  or 
remove  components  from  the  system  as  the  workload  is  increased 
or  reduced.   The  NonStop  TXP  system  is  Tandem's  most 
sophisticated  product,  designed  for  heavy  transaction 
processing  applications  in  an  on-line  environment  and 
incorporating additional performance features, such as cache
memory,  not  shared  by  other  Tandem  models. 

Automatic  switching  of  components  occurs  when  the  primary, 
active  data  path  is  interrupted.   The  most  complex 
illustration  of  this  is  at  the  CPU  level.   When  an  application 
is  scheduled  and  run  on  a  NonStop  TXP  system,  it  is  assigned 
to  a  primary  CPU  and  a  secondary  CPU.   All  CPUs  communicate 
via  Tandem's  DYNABUS,  a  set  of  two  high  speed,  bidirectional 
buses  that  provide  for  CPU  interconnection.   A  process 
performs the application in the primary CPU and transmits
periodic  signals,  termed  "I'm  alive"  signals,  to  the  secondary 
process  in  the  alternate  CPU  via  DYNABUS.   Should  the  signal 
fail  to  arrive  indicating  a  casualty  to  the  primary  CPU,  the 
secondary  CPU  is  ready  to  execute  the  application.   Thus 
duplication of components provides a stable, fault tolerant data
processing  environment  that  automatically  adjusts  for  system 
casualties  to  maintain  processing  continuity. 
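
The takeover logic described above can be illustrated with a minimal sketch. It is not Tandem's implementation: the class name, the timeout value, and the explicit clock argument are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcessPair:
    """Hypothetical model of a primary/secondary process pair."""
    primary_cpu: int
    backup_cpu: int
    timeout: float            # assumed: seconds of silence before takeover
    last_signal: float = 0.0  # time the last "I'm alive" signal arrived
    active_cpu: int = -1

    def __post_init__(self):
        # The application starts out running in the primary CPU.
        self.active_cpu = self.primary_cpu

    def im_alive(self, now: float) -> None:
        """Record an "I'm alive" signal received from the primary process."""
        self.last_signal = now

    def check(self, now: float) -> int:
        """Secondary-side check: take over if the signal is overdue."""
        overdue = now - self.last_signal > self.timeout
        if self.active_cpu == self.primary_cpu and overdue:
            self.active_cpu = self.backup_cpu
        return self.active_cpu


pair = ProcessPair(primary_cpu=0, backup_cpu=1, timeout=2.0)
pair.im_alive(now=1.0)
print(pair.check(now=2.5))  # signal is 1.5 s old: primary CPU 0 stays active
print(pair.check(now=4.0))  # signal is 3.0 s old: secondary CPU 1 takes over
```

In this sketch the secondary never hands control back; how a repaired CPU rejoins the system is outside what the text describes.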

C.   HYPERCHANNEL  LOCAL  AREA  NETWORK 

The  HYPERchannel  LAN  from  Network  Systems  Corporation  is 
capable  of  transmitting  data  at  rates  up  to  50  megabits  per 
second.   It  interfaces  a  relatively  small  number  of  devices 
over  a  distance  of  up  to  5000  feet.   These  characteristics 
classify  HYPERchannel  as  a  very  high  speed  local  computer 
network [Ref. 7:p. 249]. (The alternative classification is
the  high  speed  local  computer  network  which  is  slower,  up  to 
10 megabits per second, but accommodates more devices over a
longer distance.) The HYPERchannel bus consists of up to four
trunks  of  passive  coaxial  cable  that  can  interconnect 
heterogeneous  hosts  via  bus  interface  units  known  as 
HYPERchannel  adapters.   These  adapters  consist  of  three 
components:  trunk  interfaces,  a  microprocessor,  and  device 
interfaces.   The  device  interfaces  are  unique  for  each 
interconnected  host.   Trunk  interfaces  and  microprocessors  are 
identical  for  all  adapters.   The  functions  of  an  adapter 
include  trunk  selection,  trunk  access,  and  adapter-to-adapter 
interface.   The  adapters  support  protocols  for  each  function 
as described in the following sections. [Ref. 14]

1.   Trunk Access Protocol

Adapters  gain  access  to  HYPERchannel  based  on  a 
carrier  sense  multiple  access  (CSMA)  scheme.   An  adapter  with 
data  to  transmit  will  sense  any  activity  on  the  bus  and  will 
defer  transmission  until  the  bus  is  silent.   As  soon  as  a  bus 
transmission  is  completed,  the  following  events  occur  in 
sequence:

a.  Fixed  Delay 

There  is  a  delay  period  following  transmission  in 
which  the  adapter  that  received  the  last  message  transmits  a 
short  response  frame. 

b.  N  Delay 

After  the  initial  fixed  delay  has  elapsed, 
adapters  waiting  to  transmit  data  are  allowed  to  do  so  in  a 
fixed, prioritized order. Each adapter is programmed
according  to  its  priority  in  the  transmission  sequence.   Thus 
lower  priority  adapters  must  wait  for  a  predetermined  delay 
period  to  expire  before  they  can  transmit.   The  unique  time 
delay for each adapter is known as the N delay.

c.  End Delay

If  an  entire  N  delay  interval  elapses  without  a 
transmission,  then  an  end  delay  period  begins.   During  end 
delay,  any  adapter  may  transmit  as  long  as  the  bus  remains 
silent.   If  two  adapters  transmit  nearly  simultaneously,  a 
collision  may  result.   The  mechanism  for  synchronizing 
adapters  during  this  entire  transmission  cycle  is  a  timer  in 
each  adapter.   All  timers  are  disabled  and  initialized  during 
a  transmission.   When  the  bus  is  silent,  timers  are  enabled 
and  begin  to  count  through  the  periods  of  fixed  delay,  N 
delay,  and  end  delay. 
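
The delay sequence above may be easier to follow as a small sketch. The function below works out which waiting adapter transmits first after the bus falls silent. The adapter names and all timing values are hypothetical and do not represent actual HYPERchannel parameters.

```python
from typing import Dict, Iterable, Optional, Tuple


def next_transmitter(waiting: Iterable[str],
                     n_delay: Dict[str, float],
                     fixed_delay: float) -> Optional[Tuple[str, float]]:
    """Return (adapter, start time) for the first waiting adapter to transmit.

    Time is measured from the end of the previous transmission. None means
    no adapter was waiting, so the bus falls through to the end-delay
    period, where any adapter may transmit and collisions become possible.
    """
    waiting = list(waiting)
    if not waiting:
        return None
    # The waiting adapter with the shortest N delay has the highest priority.
    winner = min(waiting, key=lambda adapter: n_delay[adapter])
    return winner, fixed_delay + n_delay[winner]


# Illustrative values only (microseconds); not actual HYPERchannel timings.
N_DELAY = {"tandem": 4.0, "burroughs": 8.0, "perkin_elmer": 12.0}

print(next_transmitter(["burroughs", "perkin_elmer"], N_DELAY, fixed_delay=2.0))
# → ('burroughs', 10.0)
print(next_transmitter([], N_DELAY, fixed_delay=2.0))
# → None
```

Because every adapter's N delay is unique, at most one adapter begins transmitting at any instant during the N delay interval, which is why contention is confined to the end-delay period.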

2.   Adapter-Adapter  Protocol 

Adapters  exchange  data  through  predefined  bit 
quantities  known  as  frames.   There  are  three  kinds  of  frames 
associated  with  HYPERchannel  adapters.   Transmission  frames 
are  short  in  length  and  function  to  exchange  control  signals 
between  two  adapters.   Examples  of  control  signals  would 
include  a  frame  indicating  the  end  of  a  message,  or  a  frame 
requesting  that  an  adapter  reserve  itself  pending  imminent 
traffic  from  the  sending  adapter.   A  second  frame  type  is  the 
data  frame.   A  data  frame  may  be  short  (38  bytes)  and  contain 
a complete message, in which case it is termed a message
proper  frame.   Or  it  could  contain  a  block  of  data  (up  to  2k 
bytes).   Both  transmission  and  data  frames  are  acknowledged  by 
the  receiving  adapter  by  way  of  a  response  frame. 

Viewed  at  the  next  higher  level,  frames  may  be  grouped  in 
sequences.   A  message-only  sequence  is  comprised  of 
transmission  frames  and  a  message  proper  frame  together  with 
appropriate  response  frames  from  the  receiving  adapter. 
Message-only  sequences  transmit  short,  complete  messages. 
Message-with-data sequences are longer and include the
lengthier  block  data  frames  for  transmission  of  bulk  data. 

A  virtual  circuit  between  two  adapters  may  be  established 
through  transmission  frames.   A  sending  adapter  may  request 
that  a  receiving  adapter  reserve  itself  pending  a  dedicated 
exchange  of  data  between  the  two  adapters.    If  the  receiving 
adapter  has  not  been  otherwise  reserved,  i.e.  it  is  idle,  it 
transmits  a  response  frame.   The  virtual  circuit  is  thus 
established  and  data  is  transferred  between  the  two  adapters 
via  one  of  the  frame  sequences  discussed  above. 
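
A rough sketch of how the two sequence types are assembled from frames follows. The exact ordering of frames within a sequence is not given in the text, so the order shown is an assumption, as are the frame labels, the adapter names A and B, and the reading of "2k bytes" as 2048.

```python
from typing import List, Tuple

BLOCK_LIMIT = 2048  # assumed reading of "up to 2k bytes" per block data frame


def build_sequence(bulk_bytes: int = 0) -> List[Tuple[str, str]]:
    """List (sender, frame kind) pairs for one exchange from adapter A to B.

    With bulk_bytes == 0 the result is a message-only sequence; otherwise
    it is a message-with-data sequence carrying the bulk data in blocks.
    Every frame sent by A is acknowledged by a response frame from B.
    """
    blocks = -(-bulk_bytes // BLOCK_LIMIT)  # ceiling division
    frames = [("A", "transmission: reserve adapter"), ("B", "response")]
    frames += [("A", "message proper"), ("B", "response")]
    for _ in range(blocks):
        frames += [("A", "block data"), ("B", "response")]
    frames += [("A", "transmission: end of message"), ("B", "response")]
    return frames


for sender, kind in build_sequence(bulk_bytes=5000):
    print(sender, kind)
```

With 5000 bytes of bulk data the sketch emits three block data frames, each followed by a response frame from the receiving adapter.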

3.   Trunk  Selection  Protocol 

A  HYPERchannel  adapter  may  be  connected  to  up  to  four 
trunks,  with  identical  components  independently  serving  each 
trunk.   A  single  trunk  in  the  network  may  be  dedicated  to 
certain  hosts  by  programming  each  host's  adapter  to  either 
attempt  to  access  or  to  refrain  from  accessing  it.   Hence  each 
adapter  is  programmed  to  contend  for  certain  trunks  on  the 
bus. When an adapter has a message to transmit it listens in
turn  to  each  trunk  it  is  scheduled  to  contend  for  until  it 
senses  an  idle  trunk. 
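
Trunk selection as described above amounts to a simple ordered scan. The sketch below makes one pass over the trunks an adapter is programmed to contend for. The function name and the idle-test callback are hypothetical, and a real adapter would keep listening until a trunk becomes idle rather than give up after one pass.

```python
from typing import Callable, Optional, Sequence


def select_trunk(contend_for: Sequence[int],
                 is_idle: Callable[[int], bool]) -> Optional[int]:
    """Listen in turn to each trunk the adapter contends for.

    Returns the first idle trunk, or None if every such trunk is busy
    on this pass (the caller would then repeat the scan).
    """
    for trunk in contend_for:
        if is_idle(trunk):
            return trunk
    return None


busy = {0, 1}  # trunks currently carrying traffic (made-up state)
print(select_trunk([0, 2, 3], lambda trunk: trunk not in busy))  # → 2
```

Dedicating a trunk to certain hosts then reduces to leaving that trunk out of the contend_for list of every other adapter.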


III.  THE MEASUREMENT PROCESS

Various  techniques  have  been  identified  through  which  the 
LAN  Management  functions  of  operations  and  maintenance 
management  may  be  carried  out.   In  this  chapter  we  will 
discuss  these  techniques,  emphasizing  how  network  management 
is  enhanced.   Then  the  discussion  will  become  more  specific 
with  regard  to  how  CPE  techniques  are  implemented.   We  will 
discuss  various  network  measurement  tools  and  we  will 
establish  what  kinds  of  information  network  managers  need  to 
conduct  meaningful  performance  evaluation.   Finally  we  will 
link  these  last  two  topics  to  determine  what  measurement  tools 
are  appropriate  for  a  generic  local  area  network. 

A.   IMPLEMENTING  OPERATIONS  MANAGEMENT 

It  is  our  opinion  that  the  goal  of  LAN  operations 
management  is  to  enhance  the  performance  of  the 
information-processing  system  through  an  evaluation  and 
monitoring  effort  that  may  be  divided  into  areas  of  on-line 
preventive  monitoring,  system  balancing  and  tuning,  program 
tuning,  and  system  validation. 

1.   On-line  Preventive  Monitoring 

This  function  is  composed  of  measures  that,  in  the
words  of  Kee  [Ref.  8:pp.  124-6],  "...  ensure  that  the  network
performs  properly."   He  advocates  the  monitoring  of  a  local
area  network  under  normal  conditions  to  establish  typical
traffic  flows  and  loads.   The  purpose  of  this  is  to  establish 
a  reference  frame  from  which  to  identify  abnormal  network 
conditions  indicative  of  a  fault  or  inefficiency. 
Additionally,  knowledge  of  network  behavior  allows  operators 
to  identify  cyclic  states  such  as  activity  peaks  and  may 
facilitate  operator  actions  to  enhance  network  performance. 
We  recommend  that  this  function  further  include  monitoring  to 
ensure  that  a  network's  minimum  required  performance  standards 
are  being  met. 

Hence  this  activity  emphasizes  the  prevention  of 
performance  degradation.   By  contrast,  maintenance  management 
is  corrective  by  nature.   It  is  concerned  with  restoring 
network  casualties  and  returning  network  performance  to  full 
operability.

2.   System  Tuning  and  Balancing

A  bottleneck  in  a  computer  system  [Ref.  3:p.  241]  is
".  .  .  a  limitation  of  system  performance  due  to  the 
inadequacy  of  a  hardware  or  software  component  or  of  the 
system's  organization." 

Ferrari  [Ref.  9:p.  350]  considers  a  balanced  system  to
be  a  system  without  bottlenecks.   He  defines  tuning  [Ref.  9:p.
18]  as  ".  .  .  the  adjustment  of  a  system's  parameters  to  adapt
it  to  the  workload  of  an  installation."   One  goal  of  a 
performance  evaluation  study  is  to  improve  a  system's 
performance  through  identification  and  elimination  of 
bottlenecks  via  tuning  and  balancing.   In  the  context  of  a
local  area  network,  bottlenecks  may  occur  in  such  components
as  host  interface  adapters,  terminals,  terminal  concentrators,
and  workstations  [Ref.  8:p.  126].   A  balancing  action  might  be
a  reconfiguration  of  network  resources  such  as  reallocation  of 
application  programs  among  hosts  in  order  to  remove  the 
bottleneck  in  a  particular  host  interface  adapter.   A  tuning 
action  might  be  adjustment  of  a  CSMA-CD  network  minimum  packet 
size  to  reduce  maximum  network  delay. 

One  new  idea  in  the  area  of  workload  balancing  may 
prove  useful  as  a  continuous  monitoring  technique.   The 
concept  of  dynamic  load  balancing  provides  for  connection  of  a 
user  to  the  network  host  facility  with  the  least  traffic.   An 
associated  concept  is  the  resource  eligibility  listing  service 
which  would  respond  to  a  user  query  by  listing  currently 
available  network  resources.   These  measures  would  act  to 
improve  a  network's  efficiency  by  steering  computing  activity 
away  from  heavily  used  resources  toward  more  responsive  ones. 
[Ref.  10:p.  85]
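
The two concepts above can be sketched together in a few lines. This is an illustration under stated assumptions, not a description of any existing SPLICE facility: the function names, the host names, and the use of a single traffic figure per host are all hypothetical.

```python
from typing import Dict, List, Optional


def eligible_resources(traffic: Dict[str, float],
                       available: Dict[str, bool]) -> List[str]:
    """Resource eligibility listing: available hosts, least traffic first."""
    hosts = [h for h in traffic if available.get(h, False)]
    return sorted(hosts, key=lambda h: (traffic[h], h))


def connect_user(traffic: Dict[str, float],
                 available: Dict[str, bool]) -> Optional[str]:
    """Dynamic load balancing: pick the available host with the least traffic."""
    listing = eligible_resources(traffic, available)
    return listing[0] if listing else None


# Hypothetical snapshot: traffic as a fraction of each host's capacity.
traffic = {"host_a": 0.72, "host_b": 0.35, "host_c": 0.10}
available = {"host_a": True, "host_b": True, "host_c": False}

listing = eligible_resources(traffic, available)
target = connect_user(traffic, available)
print(listing, target)
```

Although host_c carries the least traffic, it is not currently available, so the user query returns host_b ahead of host_a and a new user would be connected to host_b.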

3.   Program  Tuning 

The  process  of  optimizing  a  computer  system's  workload 
by  optimizing  individual  programs  and  the  mix  of  programs 
executed  on  a  host  is  called  program  tuning  [Ref.  3:pp.
309-311].   This  also  is  a  technique  employed  as  part  of  a
performance  evaluation  study.   It  is  a  process  complementary
to  but  different  from  the  system  tuning  described  above.

a.  Optimizing  Individual  Programs

This  action  has  two  objectives:  to  reduce  required 
memory;  and  to  reduce  host  CPU  execution  time.   Depending  on 
the  bottlenecks  in  a  particular  system,  one  of  these 
objectives  might  be  more  important  than  the  other.   The 
methods  of  program  optimization  are  not  pertinent  to  this 
research.   It  is,  however,  pertinent  to  note  that  program 
optimization  is  a  valid  avenue  to  explore  in  eliminating  LAN 
host  bottlenecks  and  improving  overall  performance. 

b.  Optimizing  Program  Mix 

Performance  evaluation  studies  frequently 
demonstrate  that  certain  programs,  termed  critical  programs, 
impact  significantly  on  overall  performance.   The 
identification  of  critical  programs  is  an  early  goal  of  any 
study  since  they  are  the  object  of  both  mix  and  program 
optimization.   A  critical  program  is  characterized  by  frequent 
execution  and  by  heavy  resource  demands.   Action  taken  to 
optimize  program  mix  in  a  local  area  network  context  would 
typically  involve  distributing  applications  among  hosts  to 
best  balance  composite  network  resources  with  application 
resource  demands. 
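
One simple way to distribute applications among hosts is a greedy placement: take the applications in order of decreasing demand and put each on the host that would end up least utilized. The sketch below is our own illustration of that heuristic, not a method taken from the cited sources; the application names, demand figures, and host capacities are hypothetical and are expressed in arbitrary resource units.

```python
from typing import Dict


def distribute(demands: Dict[str, float],
               capacities: Dict[str, float]) -> Dict[str, str]:
    """Greedy placement: largest demand first, onto the host whose
    utilization (load / capacity) would be lowest after placement."""
    load = {host: 0.0 for host in capacities}
    placement = {}
    for app in sorted(demands, key=lambda a: (-demands[a], a)):
        host = min(load,
                   key=lambda h: ((load[h] + demands[app]) / capacities[h], h))
        placement[app] = host
        load[host] += demands[app]
    return placement


# Hypothetical applications and hosts.
demands = {"inventory": 40, "requisition": 30, "reports": 20, "billing": 10}
capacities = {"host_a": 100, "host_b": 50}

placement = distribute(demands, capacities)
print(placement)
```

With these figures the larger host receives three applications (70 percent utilized) and the smaller host one (60 percent utilized). A greedy heuristic of this kind does not guarantee the best possible balance; it only gives a reasonable starting mix for further measurement.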

4.   System  Validation 

One  more  function  of  CPE  measurement  is  that  of 
validating  the  design  of  the  network  [Ref.  11:p.  288].
Statistics  accumulated  over  time  that  support  network  design
parameters  add  credence  to  the  validity  of  the  initial  plan.

B.   IMPLEMENTING  MAINTENANCE  MANAGEMENT

As  noted  above,  maintenance  management  is  corrective  in 
nature.   It  is  the  process  of  detecting  a  system  casualty, 
isolating  it,  and  correcting  it  to  restore  the  system  to  full 
operability.   It  is  helpful  here  to  differentiate  the  portion
of  this  function  that  is  the  concern  of  performance 
evaluation.   Continuous  monitoring  techniques  provide 
operators  with  the  ability  to  detect,  isolate,  and  report 
system  problems  [Ref.  12:p.  64].   However,  network  monitoring
systems  do  not  provide  the  ability  to  correct  casualties  [Ref.
13:p.  97a].   Correction  is  largely  the  domain  of  operators
controlling  the  network  from  a  centralized  facility.   Network 
control  centers  will  be  discussed  in  a  later  portion  of  this 
research. 

Detection  of  casualties  may  be  done  through  the  same  sort 
of  on-line  monitoring  process  described  above.   There  is  often 
a  fine  line  separating  degradation  from  failure  in  a 
component,  blurring  the  distinction  between  the  operations  and 
maintenance  management  functions.   However,  the  principal
vehicle  is  the  continuous  monitoring  activity.   There  is
essentially  no  role  for  performance  studies  in  this  management
function.

C.   MEASUREMENT  TOOLS 

There  are  three  principal  tools  employed  in  the 
measurement  of  performance  parameters.   These  tools  are 
accounting  software,  the  software  monitor,  and  the  hardware
monitor.  [Ref.  11:pp.  283-288]

1.  Accounting  Software

It  is  common  for  a  computer  system  to  include  software
for  tracking  usage  of  its  resources  by  users  among  its
capabilities.   Typical  measures  would  include  CPU  utilization,
I/O  accesses,  memory  required,  and  connect  time  per  user. 
Since  the  purpose  of  accounting  software  is  for  billing  the 
system's  users,  it  frequently  does  not  give  the  kind  of 
detailed  data  required  for  performance  evaluation. 

2.  Software  Monitor 

A  software  monitor  is  comprised  of  code  embedded  in  an
operating  system  to  gather  data.   If  it  is  event  driven,  data
is  collected  only  for  certain  predefined  events.   If  sampling
is  employed,  data  is  collected  over  time  according  to  a
sampling  schedule.   There  are  disadvantages  to  a  software
monitor.   Since  it  is  composed  of  executable  code,  it  adds
overhead  to  the  entire  system  and  impacts  on  overall
performance.   Also,  it  is  specifically  designed  and  written
for  its  target  system,  i.e.  it  is  not  transportable.
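
The difference between the two collection styles can be shown with a small sketch. The class names, event names, and the queue-depth probe are hypothetical; a real software monitor would be embedded in the operating system rather than written as a stand-alone program.

```python
from collections import Counter
from typing import Callable, Iterable, List


class EventMonitor:
    """Event driven: data is recorded only for predefined events."""

    def __init__(self, watched: Iterable[str]) -> None:
        self.watched = set(watched)
        self.counts: Counter = Counter()

    def notify(self, event: str) -> None:
        if event in self.watched:       # all other events are ignored
            self.counts[event] += 1


class SamplingMonitor:
    """Sampling: a probe is read whenever the schedule says so."""

    def __init__(self, probe: Callable[[], float], period: int) -> None:
        self.probe = probe
        self.period = period            # sample every `period` clock ticks
        self.samples: List[float] = []

    def tick(self, clock: int) -> None:
        if clock % self.period == 0:
            self.samples.append(self.probe())


# Hypothetical event stream seen by the event-driven monitor.
events = EventMonitor({"NAK", "MESSAGE"})
for name in ["MESSAGE", "NAK", "POLL", "MESSAGE"]:
    events.notify(name)

# Hypothetical queue depth, sampled on every second tick.
state = {"queue_depth": 0}
sampler = SamplingMonitor(lambda: state["queue_depth"], period=2)
for clock, depth in enumerate([3, 5, 4, 6, 2, 1]):
    state["queue_depth"] = depth
    sampler.tick(clock)

print(dict(events.counts), sampler.samples)
```

The event-driven monitor counts every watched event but sees nothing else, while the sampling monitor sees only the state at ticks 0, 2, and 4 and misses what happens in between. In both cases the monitor code runs on the measured system, which is the source of the overhead noted above.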

3.  Hardware  Monitor 

A  hardware  monitor  is  made  up  of  digital  circuitry 
that  is  attached  to  the  target  system  at  predetermined  probe 
points.   As  the  target  system  changes  states,  data  is  passed 
through  the  probes  to  accumulators  within  the  monitor. 
Difficulties  with  hardware  monitors  include  complexities  in 
identifying  the  correct  probe  points  in  different  computer
systems,  and  the  limitations  to  collectable  information
imposed  by  the  finite  number  of  probe  points.   On  the  other
hand,  a  hardware  monitor  is  a  passive  device  that  does  not
impact  on  system  performance,  yielding  more  accurate  data.
Leach  discusses  three  types  of  hardware  monitors  applicable  to
communications  networks  [Ref.  14:pp.  48-49].

a.  Individual  Measurement  Tools 

These  machines  may  be  attached  to  a  single  device 
to  monitor  its  usage.   The  information  gathered  from  such  a 
monitor  will  be  too  restricted  to  measure  the  performance  of 
the  entire  network.   However,  it  is  a  very  effective  tool  to 
use  in  maintenance  management.   An  individual  monitor  may  be 
used  to  validate  a  trouble  call  on  a  device  to  determine 
whether  a  casualty  has  occurred. 

b.  Sub-Network  Measurement  Tool 

This  device  allows  the  gathering  of  information 
from  several  communications  lines  at  the  same  time,  making  it 
more  representative  of  network  performance. 

c.  Total  Network  Measurement  Tool 

Every  line  in  a  network  is  measured  continuously. 

D.   MEASURABLE  PARAMETERS 

In  this  section  we  will  identify  performance  indexes  that 
are  suitable  for  satisfying  the  CPE  objective  of  improved 
performance.   In  the  next  section  we  will  integrate 
performance  indexes  with  the  measurement  tools  just  discussed
to  form  a  CPE  vehicle  for  implementation  in  a  SPLICE  LAN. 

The  claim  has  been  made  that  CPE  is  not  a  science  but 
rather  an  art  [Ref.  4:p.  5].   As  we  move  closer  to  application
of  theory  to  fact  this  becomes  more  apparent.   It  is 
appropriate  at  this  point  in  our  research  to  select 
performance  indexes  that  will  best  fulfill  the  objectives  of 
measurement.   There  is  a  great  variety  of  options  and 
approaches  to  this  question.   We  will  present  a  number  of 
different  network  performance  philosophies  to  provide 
perspective  on  a  complicated  subject. 

Ferrari  discusses  quantitative  indexes  in  the  context  of 
information  processing  systems  in  general.   He  divides  a 
system's  performance  into  three  measurable  areas: 
productivity;  responsiveness;  and  utilization.   Indexes 
measuring  these  areas  are  detailed  in  Table  1.  [Ref.  9:pp.
12-13]

Other  approaches  are  specific  to  local  area  networks. 
Franta  and  Chlamtac  set  forth  four  measures  of  performance: 
channel  utilization;  channel  capacity;  expected  message  delay; 
and  buffer  occupancy  [Ref.  15:pp.  169-170].   They  point  out
that  channel  utilization  and  throughput  are  equivalent 
measures  as  long  as  the  channel  is  not  saturated. 

Previous  research  into  SPLICE  network  management 
illustrates  other  methods  [Ref.  1:pp.  55-56].   One  method
measured  acquisition  probability,  wait  time,  and  channel
efficiency  to  gauge  the  performance  of  a  contention  network
[Ref.  16:p.  401].   Another  measured  communication  capability
via  throughput,  response  time,  and  file  transfer  rate;  and
resource  utilization  of  processor,  buffer,  and  communication
line  [Ref.  17:p.  48].

TABLE  1

COMPUTER  SYSTEM  PERFORMANCE  INDEXES

INDEX  CLASS        INDEX

Productivity        Throughput  rate
                    Production  rate
                    Capacity  (maximum  throughput  rate)
                    Instruction  execution  rate
                    Data-processing  rate

Responsiveness      Response  time
                    Turnaround  time
                    Reaction  time

Utilization         Hardware  module  (CPU,  memory,  I/O  channel,
                    I/O  device)
                    Operating  system  module

Kee  recommends  monitoring  three  different  areas:  load; 
traffic  information;  and  failures.  We  will  discuss  these 
areas  briefly. 

Awareness  of  a  network's  load  allows  operators  wide  freedom
of  action  in  fulfilling  their  management  functions.   Changes
in  a  network's  load  may  provide  operators  with  the  opportunity
to  take  actions  to  meet  approaching  conditions  before  any
service  degradation  occurs.   Thus  the  preventive  monitoring
function  of  CPE  may  be  greatly  facilitated.   System  and
program  tuning  require  accurate  load  information  to  establish 
baseline  references  for  their  results.   And  the  validation  of 
network  design  depends  on  whether  the  medium  is  able  to 
function  under  real  load  conditions  as  planned. 

Useful  information  can  be  derived  from  traffic 
information,  i.e.  counts  of  messages  among  a  network's  various
nodes.   Traffic  bound  for  or  arriving  from  an  external  network 
is  also  of  interest.   Knowledge  of  error  counts  will  assist 
operators  in  identifying  failing  or  failed  components. 
Message  counts  will  also  help  them  to  tailor  their  actions  to 
respond  to  real-time  traffic  flow.   Performance  studies  will 
benefit  through  identification  of  possible  bottleneck  sources. 
Traffic  counts  will  also  serve  to  validate  the  network's 
original  design. 

Detection  of  failed  components  is  a  continuous  monitoring
activity  essential  to  the  fulfillment  of  maintenance 
management.   Network  operators  must  maintain  a  real-time 
status  of  every  network  resource  to  serve  as  a  monitoring 
reference.   Corrective  on-line  monitoring  is  the  function
served  through  failure  detection. 

Finally,  Leach  measures  three  areas  through  the  Computer 
Management  Systems  Network  Performance  Monitor:  determination 
of  end  user  satisfaction;  problem  recognition;  and  capacity 
planning  [Ref.  14:pp.  48-49].   Categories  constituting  the
first  two  areas  along  with  specific  indexes  are  detailed  in 
Tables  2  and  3. 


TABLE  2

COMPUTER  MANAGEMENT  SYSTEMS  NETWORK
PERFORMANCE  MONITOR  USER  SATISFACTION

CATEGORY                               INDEX

Response  Time                         Poll/Poll  Time
                                       Poll/Response  Time
                                       CPU  Wait  Time
                                       CPU  Message  Time
                                       Terminal  Message  Time
                                       Response  Time

Indicators  of  Failure                #  Timeouts
                                       #  NAKs
                                       #  Sense  Messages

Ratios  of  Failure/Inefficiency       #  Timeouts  /  #  Polls
                                       #  Frmr  &  Frames  Retransmitted  /
                                       #  Messages
                                       #  Sense  Messages  /  #  Messages

Management  Reports  Summarized  By    Terminal  (Physical  Address)
                                       Control  Unit
                                       Line
                                       Location
                                       Program

TABLE  3

COMPUTER  MANAGEMENT  SYSTEMS  NETWORK
PERFORMANCE  MONITOR  PROBLEM  RECOGNITION

CATEGORY                               INDEX

Counts  of  Error                      #  NAKs
                                       #  Sense  Messages
                                       #  Timeouts

Ratios  of  Error                      #  Timeouts  /  #  Polls
                                       #  NAKs  /  #  Messages
                                       #  Sense  Messages  /  #  Messages

Measure  of  Service  Level            Response  Time
                                       CPU  Wait  Time
                                       Overhead  Factor  (%)
                                       Utilization  Factor  (%)

Management  Reports  Summarized  by    Terminal  (Physical  Address)
                                       Control  Unit
                                       Line
                                       Location
                                       Program

IV.  MODEL  DEVELOPMENT

We  propose  to  bring  performance  evaluation  techniques  to 
bear  on  a  local  area  network  model  in  an  effort  to  illustrate 
how  overall  network  performance  may  be  facilitated. 
Implementation  will  take  place  along  the  two  central  paths  of 
CPE,  continuous  monitoring  activities  and  performance  studies. 
A  personnel  structure  to  best  support  both  efforts  will  also 
be  discussed.   For  our  underlying  network  we  will  address  the 
SPLICE  configuration  of  a  small  (2-4)  number  of  hosts 
(including  a  Tandem  NonStop  TXP)  linked  via  HYPERchannel.

A.   APPLICATION  OF  PERFORMANCE  STUDIES 

The  concept  of  conducting  performance  studies  through  a 
site  computer  performance  evaluation  team  is  discussed  by 
Morris  and  Roth  [Ref.  18:pp.  19-50].   The  following  points  are
based  on  concepts  from  that  source.   The  authors  emphasize
that  their  treatment  is  not  academic  but  rather  is  based  on
unverified  empirical  experiences.   It  is  offered  here  as  a
practical  discussion  of  an  important  subject  that  has  not  seen
much  quantifiable  research. 

To  begin  with,  there  is  the  very  real  question  of  whether
it  is  feasible  for  an  activity  to  become  involved  with  CPE
studies  at  all.   There  are  two  primary  reasons  why
organizations  do.   One  reason  is  that  computer  system  problems
may  mandate  performance  evaluation  in  an  effort  to  find
solutions.   A  computing  activity  confronting  a  substantial 
addition  to  its  workload  may  employ  CPE  to  find  the  best  ways 
to  stretch  its  existing  resources.   There  are  numerous  valid 
reasons  to  cause  an  organization  to  look  deeply  into  its  own 
operation,  and  CPE  provides  a  valid  avenue  for  that 
introspection.   A  second  reason  is  simply  the  expected  return 
on  investment.   For  instance,  one  industry  guideline  is  that 
if  a  modest  increase  in  computer  system  productivity  of  2%  to 
5%  is  enough  to  save  the  cost  of  two  to  three  technicians,  CPE
is  probably  a  productive  investment. 

Once  a  decision  has  been  made,  by  either  of  the  above 
criteria,  to  employ  a  performance  study  methodology,  its  first 
efforts  should  be  aimed  at  system  resource  hogs,  i.e.  those 
applications  that  account  for  the  most  resource  usage. 
Another  industry  rule  of  thumb  maintains  that  10%  of  the 
applications  running  on  an  information  processing  system 
account  for  more  than  50%  of  its  resources.   Although  not
scientific,  this  serves  to  assert  a  point  that  early  CPE 
efforts  should  identify  and  target  for  improvement  those 
objects  that  offer  the  most  potential  for  improvement. 
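
Identifying the resource hogs amounts to ranking applications by measured usage and accumulating their share of the total. The following sketch assumes that per-application usage figures are already available from some measurement tool; the application names and CPU-hour figures are hypothetical and were chosen only to mirror the rule of thumb above.

```python
from typing import Dict, List


def resource_hogs(usage: Dict[str, float], share: float = 0.5) -> List[str]:
    """Smallest set of top-ranked applications whose combined usage
    reaches the given share of the total."""
    total = sum(usage.values())
    hogs: List[str] = []
    running = 0.0
    for app in sorted(usage, key=lambda a: (-usage[a], a)):
        if running >= share * total:
            break
        hogs.append(app)
        running += usage[app]
    return hogs


# Hypothetical CPU-hours for ten applications over one reporting period.
usage = {"app01": 550}
usage.update({"app%02d" % n: 50 for n in range(2, 11)})

hogs = resource_hogs(usage)
print(hogs, len(hogs) / len(usage))
```

In this constructed case a single application, one tenth of the population, accounts for 55 percent of the usage and would be the first target of the study.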

Goals  should  be  modest  at  first.   For  instance,  it  may  be 
feasible  to  set  an  objective  of  reducing  resource  usage  of  the 
two  most  active  programs  by  5%.   Attainment  of  this  objective
will  be  of  tangible  benefit  to  the  computer  system  and  result 
in  a  learning  experience  for  the  CPE  technicians  involved  in 
the  effort.   Moreover,  others  will  become  aware  of  the
possibility  of  improving  system  inefficiencies  and  may  well 
become  sources  for  further  improvement  suggestions. 

Another  assertion  by  Morris  and  Roth  is  that  the  greatest 
impact  of  performance  studies  will  be  realized  at  the 
beginning  of  the  CPE  effort  and  that  this  impact  will  decline 
as  continued  attention  results  in  improved  performance.   In 
determining  the  frequency  of  performance  studies,  a  guideline 
of  every  six  months  is  a  recommended  minimum.  With  all  other 
factors  constant  (e.g.  workload),  as  the  computer  system 
matures  the  frequency  of  measurement  will  decrease.   A  variety 
of  unique  circumstances  suggest  that  performance  studies  would 
be  appropriate.   When  new  applications  are  installed,  or  when 
the  mix  or  number  of  users  changes  significantly,  system 
performance  might  be  well  served  by  a  CPE  effort. 

Performance  studies  are  often  conducted  most  successfully 
at  sites  where  a  CPE  team  is  formally  constituted.   One 
configuration  that  is  frequently  effective  is  a  team 
composition  of  three  members  whose  total  combined  annual 
effort  in  the  CPE  area  is  at  least  one  man-year.   Therefore 
performance  studies  do  not  become  anyone's  full  time  job,  but 
are  a  serious  collateral  job  for  a  number  of  people. 
Desirable  backgrounds  for  CPE  team  members  include  systems  and 
application  programming  experience  and  experience  in  equipment 
maintenance.   These  guidelines  depend  on  variables  such  as  the 
size  of  the  investment  in  the  computer  system.   The  larger  the 
value  it  represents,  the  more  likely  the  benefit  from  frequent
measures.  The  more  stable  the  workload,  the  less  frequent  the
required  measurement. 

B.   APPLICATION  OF  CONTINUOUS  MONITORING  ACTIVITIES 

Our  opinion  is  that  numerous  valid  approaches  could  be
developed  for  structuring  the  monitoring  of  a  SPLICE  LAN.   The 
author  will  adopt  that  of  Leach  [Ref.  14],  principally  because
the  areas  of  performance  measurement  he  cites  correspond 
closely  to  the  LAN  management  areas  that  this  research  is 
intended  to  support.   The  capacity  planning  area  meets  the 
needs  of  the  planning  management  function  but,  for  reasons 
cited  earlier,  will  not  be  addressed.   The  problem  recognition 
area  seen  in  Table  3  will  provide  measurements  for  the 
maintenance  management  function.   And  the  user  satisfaction 
indexes  such  as  those  in  Table  2  are  proposed  to  fulfill  the
operations  management  function. 

It  isn't  possible  to  directly  adopt  every  index  due  to
differences  in  the  characteristics  of  the  SPLICE  LAN  and  the
target  network  in  [Ref.  14].   Nor,  in  our  view,  is  it  desired.
The  model  proposed  below  is  intended  to  be  as  simple  as
possible,  offering  a  small  number  of  measurable  indexes
reflecting  desirable  information.   More  indexes  could  be
incorporated  to  increase  the  information  yield,  at  the
discretion  of  the  implementor.

1.   Measurement  of  User  Satisfaction

Response  time  is  proposed  as  the  primary  index. 
Response  time  can  be  quantified  various  ways  to  offer  more
refined  information;  however,  mean  response  time  is  the  basic
index  recommended.   An  indicator  of  failure,  or  reflection  of 
user  dissatisfaction,  may  be  gained  from  counting  NAKs  over 
time.   To  add  another  perspective,  a  ratio  of  failure  or 
inefficiency  may  be  calculated  by  comparing  the  number  of  NAKs 
to  the  number  of  messages  transmitted.   It  would  be  useful  to 
compile  these  measures  in  reports  summarized  along  the  same 
lines  as  the  system  in  Table  2: 
-By  terminal 
-By  control  unit 
-By  communication  line 
-By  location 
-By  program 
Such  categorization  provides  additional  valuable  information 
for  analysts. 
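
The three measures and their summaries can be computed from a simple log of transmitted messages. The sketch below is illustrative only: the Observation record, its field names, and the sample values are assumptions, and each record is taken to represent one transmitted message together with a flag showing whether it drew a NAK.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Observation:
    terminal: str
    line: str
    response_time: float   # seconds
    nak: bool              # True if the message was answered with a NAK


def summarize(observations: Iterable[Observation],
              key: str) -> Dict[str, Dict[str, float]]:
    """Mean response time, NAK count, and NAK ratio, grouped by `key`."""
    groups = defaultdict(list)
    for obs in observations:
        groups[getattr(obs, key)].append(obs)
    report = {}
    for name, group in groups.items():
        naks = sum(1 for o in group if o.nak)
        report[name] = {
            "mean_response_time": sum(o.response_time for o in group) / len(group),
            "naks": naks,
            "messages": len(group),
            "nak_ratio": naks / len(group),
        }
    return report


# Hypothetical log of four messages.
log = [
    Observation("T1", "L1", 1.0, False),
    Observation("T1", "L1", 3.0, True),
    Observation("T2", "L1", 2.0, False),
    Observation("T3", "L2", 4.0, False),
]

by_line = summarize(log, "line")
by_terminal = summarize(log, "terminal")
print(by_line["L1"], by_terminal["T1"])
```

The same function produces the report by terminal or by communication line simply by changing the grouping key; control unit, location, and program would be further fields on the record.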

2.   Measurement  of  Problem  Recognition 

The  number  of  NAKs  is  proposed  as  the  principal  index 
of  casualties.   It  may  be  seen  that  faulty  adapters  and  many 
software  problems  will  manifest  themselves  directly  through 
unsuccessful  attempts  at  communication,  reflected  in  the 
control  frame  NAK.   A  different  perspective  may  be  gained  by 
calculating  the  ratio  of  NAKs  to  total  messages.   And  to 
reflect  the  level  of  service  provided,  response  time  may  be 
used.   It  is  noted  that  response  time  and  NAK  count  are
somewhat  complementary  measures.   A  casualty  in  a  host  adapter
may  prevent  that  host  from  transmitting  data  on  the
HYPERchannel,  yielding  an  aberrant  response  time.   If  the  host
had  no  traffic  to  send  its  casualty  would  be  discovered  when 
its  adapter  failed  to  properly  receive  and  receipt  for  message 
sequences,  yielding  a  large  NAK  count.   The  same  report 
summaries  seen  in  the  user  satisfaction  measurement  are 
recommended  here  as  well. 

3.  Model  Summary 

The  following  indexes  would  require  continuous 
measurement:
-Mean  response  time
-#  NAKs
-#  messages
An  efficient  method  of  monitoring  response  time  is  the
sub-network  measurement  tool  discussed  previously,  which  will
measure  a  number  of  different  devices  at  once.   The
measurement  of  NAKs  and  messages  suggests  a  software  monitor,
necessitating  embedding  code  in  the  HYPERchannel  operating 
system. 

4.  Establishing  Performance  Standards 

One  of  the  previously  stated  objectives  of  continuous 
monitoring  is  to  provide  network  operators  with  a  picture  of 
normal  network  operation  so  that  they  can  discern  abnormal 
conditions.   This  allows  them  to  take  preventive  action  to 
avoid  system  degradations.   It  is  logical  to  define  a  normal
range  of  conditions  in  some  manner  that  will  permit  operators 
to  easily  identify  aberrant  conditions.   A  set  of  performance 
standards,  defined  through  observation  of  network  behavior 
over  time,  would  meet  this  need.   Such  a  set  of  standards
could  be  accumulated,  defined,  and  modified  on  the  local  site 
level  and  in  an  informal  manner. 
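
One informal way to turn observed behavior into a standard is to take the mean of past observations and allow a band of a few standard deviations around it. The sketch below shows this for mean response time. The band width k = 2, the function names, and the sample history are assumptions; a site would choose its own rule and revise it as experience accumulates.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def normal_range(history: Sequence[float], k: float = 2.0) -> Tuple[float, float]:
    """Normal range as mean plus or minus k sample standard deviations."""
    centre, spread = mean(history), stdev(history)
    return (centre - k * spread, centre + k * spread)


def is_aberrant(value: float, limits: Tuple[float, float]) -> bool:
    """True when an observation falls outside the normal range."""
    low, high = limits
    return not (low <= value <= high)


# Hypothetical mean response times (seconds) observed under normal load.
history = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]
limits = normal_range(history)

print(limits, is_aberrant(1.1, limits), is_aberrant(2.5, limits))
```

With this history the normal range is roughly 0.72 to 1.28 seconds, so a reading of 1.1 seconds passes while a reading of 2.5 seconds would be flagged for operator attention.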

C.   NETWORK  CONTROL  CENTER 

Closely  wedded  to  the  process  of  continuous  monitoring  is 
the  concept  of  the  Network  Control  Center  (NCC).   Its 
functions  include  continuous  monitoring  of  the  network,  and 
also  the  capability  of  isolating  faults  and  the  capability  to 
manually  manipulate  LAN  configuration  [Ref.  1:p.  321].   Thus
we  find  full  expression  of  the  LAN  maintenance  function,  as 
well  as  the  operations  management  function.   Rarely,  though, 
does  communication  network  technology  allow  complete 
implementation  of  all  this  functionality.   If  we  define  the 
suites  of  equipment  intended  to  implement  the  NCC  as  network 
management  systems  we  may  discuss  five  separate  varieties  of 
such  system  [Ref.  13:p.  102a]. 

One  category  of  network  management  system  is  comprised  of 
host-based  software.   This  software  gathers  status  information 
about  intelligent  terminals  and  communications  controllers 
through  continuous  on-line  testing.   Thus  there  is  a  running 
status  that  provides  for  notification  of  an  operator  should  a 
component  develop  trouble  or  an  error  threshold  be  reached.
This  information  is  useful  but  usually  is  not  a  very  thorough
performance  evaluation.   And  there  is  no  capability  for
reconfiguration.

Network  control  systems  are  vendor  supplied  systems  that 
monitor  and  exercise  some  control  over  a  narrow  range  of 
network  functions.   Their  purpose  is  to  manage  modems  and 
multiplexers.   They  perform  continuous  monitoring  of  this
equipment,  they  can  run  diagnostic  tests  and  loopback  tests  to 
isolate  casualties,  and  they  can  switch  from  a  failed  port  on 
a  modem  or  multiplexer  to  an  active  one.   Their  modest  scope 
of  action  often  limits  their  effectiveness. 

Technical  control  systems  are  essentially  systems  of 
diagnostic  equipment  such  as  data  monitors,  datascopes,  and  VF 
test  equipment.   Their  purpose  is  to  troubleshoot  faults  on 
communications  lines  and  circuits.   They  are  expensive,  have 
no  monitoring  capability,  and  provide  only  a  line  level  view 
of  performance. 

Network  monitoring  systems  are  valuable  continuous 
monitoring  tools.   They  perform  the  essential  operations 
management  functions  of  measuring  network  performance  and 
alerting  operators  to  casualties.   But  they  do  not  have  the 
control  capabilities  to  reconfigure  and  correct  problems. 

Digital  and  analog  switching  systems  are  matrix  switches 
that  permit  prompt  reconfiguration  of  lines  and  equipment. 
Some  of  these  systems  provide  information  about  network 
configuration  and  some  allow  control  of  diagnostic  and  test
equipment  attached  to  the  switch.   However,  none  have 
performance  measurement  capabilities. 

Some  networks  use  combinations  of  the  above  systems  to 
complement  each  other's  capabilities  in  order  to  achieve  full
network  control  center  functionality.   A  typical  small-scale
configuration  for  a  bus  topology  LAN  is  comprised  of  a  central
activity  which  connects  to  the  bus  through  an  adapter  and 
contains  a  keyboard,  a  microcomputer,  and  the  personnel 
necessary  to  observe  the  network  and  exercise  control.   This 
would  essentially  combine  a  network  monitoring  system  and  a 
digital  switching  system  to  constitute  an  NCC. 

V.   APPLYING  THE  MODEL  TO  SPLICE

The  purpose  of  this  chapter  is  to  compare  the  attributes 
of  our  performance  evaluation  model  to  existing  SPLICE  system 
capabilities  to  determine  whether  there  are  areas  of 
significant  difference.   We  propose  to  bring  performance 
evaluation  techniques  to  bear  on  a  SPLICE  local  area  network 
in  an  effort  to  improve  overall  network  performance  in  support 
of  the  LAN  management  functions  of  operations  and  maintenance 
management.   This  discussion  will  be  divided  into  the  areas  of 
continuous  monitoring  and  performance  studies. 

A.   CONTINUOUS  MONITORING  CAPABILITIES

1.   XRAY

The  principal  monitoring  tool  at  the  disposal  of
SPLICE  operators  is  a  Tandem  off-the-shelf  performance 
software  package  called  XRAY.   Resident  in  the  NonStop  TXP, 
XRAY  conforms  to  our  definition  of  a  host-based  software 
system.   The  following  description  of  XRAY  is  drawn  from  [Ref.
19].

XRAY  runs  on  most  Tandem  systems,  is  fully  integrated
with  the  Tandem  operating  system  and  their  off-the-shelf  data 
base  and  communications  programs,  and  can  be  operated  from  any 
asynchronous,  point-to-point  terminal.   A  large  number  of 
computer  system  components  may  be  the  subject  of  data 
gathering.   In  XRAY  parlance,  components  are  called  "entities"
and  include  physical  devices  (such  as  CPUs,  printers,  tape
drives,  discs,  communications  lines,  and  terminals),
processes,  disc  files,  and  other  computer  systems  in  the  same 
network.   XRAY  is  a  sampling  program  which  collects  data  at 
user-specified  intervals.   Performance  measurements  are  made
through  a  four-step  process  which  involves:

-stating  the  measurement  question 

-selecting  the  entities  to  be  measured 

-making  the  measurement 

-analyzing  the  data 

A  statement  of  the  problem  is  prerequisite  to  all  CPE 
efforts  and  is  not  unique  to  measurement  via  XRAY.   It 
involves  a  clear  articulation  of  the  perceived  problem  and 
must  be  based  on  sound  knowledge  of  the  application  being  run 
on  the  system  and  the  physical  hardware  configuration 
involved. 

The  next  step  is  to  decide  what  software  and  hardware 
components  are  to  be  measured.   A  configuration  file  is 
created,  specifying  the  entities  involved.   Principal 
configuration  options  include: 

-buffer  activity 

-physical  devices 

-files 

-application  programs 

-host  system  or  remote  host 

A  configuration  file  can  be  created  at  any  time,  including
during  execution  of  a  measurement.   The  new  file  is  then
immediately  implemented.  Within  an  application  program,  a 
code  range  may  be  specified  to  measure  which  sections  of  a 
target  program  are  consuming  the  most  CPU  time.  Along  with 
the  configuration  file,  a  separate  data  file  is  created  to 
serve  as  the  collection  point  for  measurements. 

Measurements  are  made  by  executing  the  configuration 
and  data  files  through  a  Tandem  program  called  XRAYCOM. 
XRAYCOM  also  specifies  the  time  interval  of  data  sampling  (one 
to  3600  seconds). 
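
The sampling behavior described above can be illustrated with a minimal Python sketch. It is not XRAY or XRAYCOM code: the names `validate_interval`, `sample_entities`, and the `read_counter` callback are hypothetical, time is simulated rather than real, and only the one-to-3600-second interval range is taken from the text.

```python
from typing import Callable, Dict, List

MIN_INTERVAL_S = 1      # lower bound quoted in the text
MAX_INTERVAL_S = 3600   # upper bound quoted in the text


def validate_interval(seconds: int) -> int:
    """Return the interval if it lies in the 1..3600 second range, else raise."""
    if not MIN_INTERVAL_S <= seconds <= MAX_INTERVAL_S:
        raise ValueError(f"sampling interval must be 1..3600 s, got {seconds}")
    return seconds


def sample_entities(
    entities: List[str],
    read_counter: Callable[[str, int], float],
    interval_s: int,
    duration_s: int,
) -> Dict[str, List[float]]:
    """Collect one reading per entity at each sampling instant.

    Time is simulated (no sleeping) so the sketch runs instantly;
    read_counter(entity, t) is a hypothetical stand-in for a real probe.
    """
    validate_interval(interval_s)
    data: Dict[str, List[float]] = {name: [] for name in entities}
    for t in range(0, duration_s, interval_s):
        for name in entities:
            data[name].append(read_counter(name, t))
    return data


if __name__ == "__main__":
    # Invented probe: CPU busy fraction rises slowly, disc stays flat.
    def fake_probe(name: str, t: int) -> float:
        return 0.5 if name == "disc" else 0.1 + t / 1000

    print(sample_entities(["cpu", "disc"], fake_probe, interval_s=60, duration_s=300))
```

A shorter interval gives finer resolution at the cost of a larger data file, which is the trade-off the operator makes when choosing a value within the permitted range.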

Analysis  of  the  collected  data  is  done  by  a  program 
called  XRAYSCAN.   This  program  is  run  on  the  data  file  either 
after  the  measurement  period  or,  if  on-line  monitoring  is 
desired,  during  the  measurement  period.   Analysis  can  be 
conducted  on  any  entity  by  specifying  it  in  XRAYSCAN. 
Information  can  be  presented  in  any  of  three  formats:  reports; 
time  plots;  and  histograms.   A  range  of  performance  indexes  is 
measured  on  each  entity.   For  instance,  the  following  items 
can  be  derived  from  the  measurement  of  a  terminal: 

-response  time 

-terminal  rate 

-read  rate 

-write  rate 

-byte  rate 

-transmission  rate 

A  complete  list  of  measurable  performance  indexes  may  be  found
in  [Ref.  19:p.  4-69].
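
As a rough illustration of how rate-type indexes such as those listed for a terminal could be derived from sampled data, the sketch below divides the change in cumulative counters by the sampling interval. This is an assumption about the arithmetic, not a description of XRAYSCAN internals; `TerminalCounters` and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TerminalCounters:
    """Cumulative counters for one terminal (hypothetical field names)."""
    reads: int
    writes: int
    bytes_moved: int


def rates(prev: TerminalCounters, curr: TerminalCounters, interval_s: float) -> dict:
    """Convert two cumulative snapshots into per-second rates."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return {
        "read_rate": (curr.reads - prev.reads) / interval_s,
        "write_rate": (curr.writes - prev.writes) / interval_s,
        "byte_rate": (curr.bytes_moved - prev.bytes_moved) / interval_s,
    }


if __name__ == "__main__":
    # Invented snapshots taken 30 seconds apart.
    before = TerminalCounters(reads=10, writes=5, bytes_moved=1000)
    after = TerminalCounters(reads=40, writes=20, bytes_moved=7000)
    print(rates(before, after, interval_s=30))
```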

A  number  of  performance  evaluation  applications  result 
from  XRAY's  measurement  capabilities.   XRAY  is  capable  of: 

-mix  balancing 

-capacity  planning 

-application  tuning 

-continuous  monitoring 

The  capabilities  of  this  software  monitor  make  it  a  powerful
tool  in  measuring  the  performance  of  the  Tandem  computer.
Since  it  is  resident  in  the  SPLICE  front-end  processor  it  is
able  to  measure  parameters  associated  with  all  SPLICE
subsystems.

This  is  of  particular  interest  in  the  measurement  of 
the  Terminal  Management  Subsystem  (TM).   Since  all  terminals 
are  interfaced  through  TM  it  is  possible  to  monitor  the 
response  times  at  individual  devices  without  the  need  for 
separate  hardware  monitors.   The  calculation  of  response  time 
is  slightly  different  because  it  is  referenced  from  the  CPU 
and  not  the  device.   However,  since  this  difference  is
constant  across  all  measurements  it  is  a  matter  of  definition
and  not  substance.
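
The point about a constant difference can be checked numerically. The sketch below uses invented response times and an assumed fixed offset between the CPU and the device. Shifting every sample by the same amount moves the mean by that amount but leaves the spread, and therefore any comparison between measurements, unchanged.

```python
from statistics import mean, pstdev


def shift(samples_ms, offset_ms):
    """Add a constant offset to every sample (CPU-referenced to device-referenced)."""
    return [s + offset_ms for s in samples_ms]


def summary(samples_ms):
    """Return (mean, population standard deviation, range) of the samples."""
    return mean(samples_ms), pstdev(samples_ms), max(samples_ms) - min(samples_ms)


# Invented response times in milliseconds, as seen from the CPU.
cpu_view_ms = [1200, 900, 1500, 1100]
# Assumed constant delay between the device and the CPU measurement point.
OFFSET_MS = 250
device_view_ms = shift(cpu_view_ms, OFFSET_MS)

cpu_mean, cpu_sd, cpu_range = summary(cpu_view_ms)
dev_mean, dev_sd, dev_range = summary(device_view_ms)

print(dev_mean - cpu_mean)     # difference of the means equals the offset
print(dev_sd - cpu_sd)         # standard deviation is unaffected by the shift
print(dev_range == cpu_range)  # range is unaffected by the shift
```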

2.   Network  XRAY

Also  of  interest  is  XRAY's  ability  to  monitor  the  DDN 
gateway.  Current  SPLICE  planning  involves  interconnection  of 
the  62  separate  SPLICE  sites  using  the  Defense  Data  Network  as 
the  wide  area  network  medium.   Interface  to  the  DDN  will  be
via  Tandem's  EXPAND  and  TRANSFER  off-the-shelf  software
packages  [Ref.  4:p.  4-28].   The  use  of  Tandem  software  will
permit  monitoring  of  selected  nodes  from  any  Tandem  host  using
Network  XRAY.   Specific  capabilities  include: 

-Measurement  of  packet  traffic  with  every  remote  system  and 
the  measured  system,  with  distinction  between  "local" 
packets  and  forwarded  packets 

-Measurement  of  the  total  time  to  transmit  a  logical  message 
to  each  remote  system 

-Network  communications  line  utilization 

-Number  of  bytes  sent  and  bytes  received  in  Level  2  protocol 
communication  on  the  lines 

-Number  of  bytes  sent  and  received  in  Level  4  protocol
communication,  distinguishing  between  data  bytes  and 
control  bytes 

-Number  of  messages  sent  which  were  smaller  than  64,  128, 
256,  etc.  bytes  long,  respectively 

-Measurement  of  file  activity  against  files  opened  on  a 
remote  system,  on  a  per-file-open  basis,  tied  to  the
opening  process 

-Measurement  of  disc  file  activity  on  the  measured  system 
which  originated  on  the  various  remote  systems,  again  on  a 
per-file-open  basis.   [Ref.  19:p.  1-9]
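
The message-size measurement in the list above (messages smaller than 64, 128, 256, etc. bytes) can be pictured as a set of threshold counts. The sketch below is one possible reading, assuming the counts are cumulative, that is, each threshold counts every message strictly smaller than it. The function name, the default thresholds beyond 256, and the sample sizes are hypothetical; none of this is Network XRAY code.

```python
from bisect import bisect_left


def size_histogram(sizes, limits=(64, 128, 256, 512, 1024)):
    """Count messages strictly smaller than each limit (cumulative counts)."""
    ordered = sorted(sizes)
    # bisect_left returns how many elements of the sorted list are < limit.
    return {limit: bisect_left(ordered, limit) for limit in limits}


if __name__ == "__main__":
    # Invented message lengths in bytes.
    print(size_histogram([10, 64, 100, 200, 300, 2000]))
```

A distribution weighted toward the small thresholds would suggest a workload dominated by short interactive messages rather than bulk transfers.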

3.   NETEX 

Monitoring  of  the  HYPERchannel  local  area  network  may
be  implemented  via  the  Network  Systems  Corporation's  operating
system,  a  program  called  NETwork  Executive  or  NETEX.   NETEX
provides  for  the  accumulation  of  network  utilization
statistics  through  its  Network  Administrator  function  [Ref.
5:p.  2-6].   There  are  no  preprogrammed  continuous  monitoring
modules  in  NETEX,  although  users  are  able  to  embed  their  own
code  to  effect  a  software  monitor  [Ref.  20].   Moreover,
although  several  commercially  marketed  hardware  monitors  are 
available  for  slower  speed  local  area  networks  such  as  Xerox's 
Ethernet,  none  is  available  for  high  speed  networks  such  as 
HYPERchannel. 

4.   Identifying  Performance  Standards 

It  might  seem  logical  to  look  to  the  SPLICE  benchmark, 
requirements  for  which  are  contained  in  [Ref.  21],  as  a  source
of  usable  performance  measures.   Unfortunately  the  benchmark 
was  conducted  using  a  synthetic  test  workload,  i.e.  a  workload 
composed  of  a  combination  of  real  components  and  purposely 
constructed  components.   Although  its  on-line  performance 
parameters  are  defined  in  terms  of  response  time,  consistent 
with  our  model,  specific  standards  are  not  transferable  to 
actual  network  traffic.   We  propose  the  accumulation  of  a  CPE 
data  base,  as  recommended  in  [Ref.  18:p.  45],  to  build  a  base
for  employable  standards.   Since  the  workloads  vary 
significantly  among  SPLICE  sites,  this  effort  should  be 
undertaken  locally  and  used  locally. 
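
One way a site might turn an accumulated CPE data base into a local standard is to take a high percentile of its own historical response times and flag observations that exceed it. The sketch below assumes that approach. The nearest-rank method, the 90th-percentile default, and the function names are illustrative choices and do not come from the SPLICE documentation.

```python
def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile (integer pct, 1..100) by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples in the data base")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def exceeds_standard(observed, history, pct=90):
    """True if an observation is worse than the locally derived standard."""
    return observed > nearest_rank_percentile(history, pct)


if __name__ == "__main__":
    # Invented response times (seconds) gathered at one site.
    history = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    print(nearest_rank_percentile(history, 90))
    print(exceeds_standard(9.5, history))
```

Because the threshold is computed from the site's own history, two sites with different workloads would arrive at different standards, consistent with the recommendation that the effort be undertaken and used locally.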

B.   PERFORMANCE  STUDY  CAPABILITIES 

It  is  unlikely  that  many  SPLICE  sites  will  realize  the 
kind  of  savings  through  performance  studies  that  would  justify 
the  dedicated  effort  described  in  the  last  chapter.   This  is 
not  to  say  that  no  such  effort  is  advisable.   SPLICE  systems 
are  in  the  fledgling  stages  of  implementation  with  a
burgeoning  workload  of  new,  interactive  applications  that  add 
up  to  a  dynamic  picture  of  performance  and  growth.   Local 
sites  may  find  themselves  engaged  in  performance  studies 
driven  by  unforeseen  performance  problems.   Even  without  the
purchase  of  specialized  CPE  tools,  local  site  personnel  have 
some  capability  at  their  disposal  through  the  continuous 
monitoring  capability  described  above. 

Another  logical  possibility  would  be  to  remove  the  regular 
execution  of  performance  studies  to  an  echelon  higher  than  the 
local  site.   Fleet  Material  Support  Office  (FMSO),  the
SPLICE  project  manager,  already  conducts  some
performance-oriented  work  at  individual  sites.   FMSO  teams  visit  each
SPLICE  site  prior  to  equipment  installation,  during  equipment 
installation,  and  at  intervals  thereafter  and  in  some 
instances  conduct  performance  evaluations  of  SPLICE  systems 
[Ref.  22].   It  may  be  feasible  to  implement  dedicated  SPLICE 
performance  study  teams  for  the  purpose  of  conducting  ongoing 
evaluations  of  SPLICE  sites.   This  would  enable  systematic 
tuning  of  hardware  systems  and  software  applications  while 
enjoying  the  benefits  of  continuity  and  the  experience  gained 
across  the  full  range  of  SPLICE  sites.   Whether  this  is 
advisable  or  not,  it  is  certainly  a  capability  at  the  Navy's 
disposal.

C.   NETWORK  CONTROL  CENTER  CAPABILITY

System  specifications  for  SPLICE  provide  for  exercising 
control  of  the  local  area  network  through  a  network 
administrator  utility,  allowing  an  operator  to  modify  LAN  node 
status,  control  access  to  resources,  and  display  current 
sessions.   This  control  is  exercised  through  an  operator 
console  communicating  through  a  host  to  a  HYPERchannel 
adapter.   Tables  resident  in  the  network  adapters  permit  this 
control  to  be  realized. 

Control  of  the  data  communications  network  is  implemented 
through  Tandem  software.   The  Network  Monitor  (NETMON)  program 
allows  logging  network  status  changes,  remote  system  processor 
status  changes,  and  a  display  of  network  volumes. 

It  is  possible  to  control  the  LAN  and  indeed  the  site  by 
centrally  locating  the  above  functions  together  with  a  vehicle 
for  monitoring  terminal  response  times  (XRAY  or  a  subnetwork 
measurement  tool/response  time  monitor).

VI.   SUMMARY  AND  RECOMMENDATIONS

Improvement  in  the  real-time  performance  of  a  SPLICE  local 
area  network  can  be  realized  through  application  of  computer 
performance  evaluation  techniques  to  the  network  and 
associated  hosts.   The  method  for  implementing  this 
performance  evaluation  process  is  the  measurement  of 
predefined  performance  indexes.   The  principal  measurement 
vehicles  are  continuous  monitoring  and  the  performance  study, 
both  of  which  are  pertinent  to  improvement  of  real-time 
performance.   Two  principal  management  areas,  operations  and 
maintenance  management,  will  benefit  from  this  overall 
improvement  in  performance.   Operations  management  may  be 
realized  through  on-line  monitoring  and  through  the  tuning  and 
balancing  of  hardware  systems  and  software  applications. 
Maintenance  management  is  principally  fulfilled  through 
on-line  monitoring. 

Performance  indexes  for  SPLICE  were  offered  that  would 
measure  these  areas.   We  recommended  realizing  operations 
management  functionality  by  gauging  user  satisfaction  and 
proposed  response  time  as  one  representative  measure.   The 
other  proposed  index  was  problem  recognition.   This  would 
fulfill  the  maintenance  management  function  and  could  be 
measured  through  a  count  of  NAKs,  a  HYPERchannel  subnetwork 
response  frame  indicating  non-receipt  of  a  transmitted  data 
frame.   Many  other  measures  are  possible  and  feasible  and
could  be  employed  to  render  more  detailed  and  complete
information  about  system  performance.   Response  time  could  be
measured  through  a  hardware  monitor  or  through  existing  Tandem 
software.   NAKs  would  be  counted  through  a  software  monitor 
integrated  with  the  NETEX  operating  system. 
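
A NAK-counting monitor of the kind proposed here could be as simple as tallying NAK frames per time interval and flagging intervals whose count exceeds a threshold. The sketch below assumes a frame log of (timestamp, kind) pairs. That log format, the function names, and the threshold are hypothetical; the sketch does not use or represent any NETEX interface.

```python
from collections import Counter


def nak_counts(frames, interval_s):
    """Tally NAK frames per interval.

    frames: iterable of (timestamp_s, kind) pairs, kind being "ACK" or "NAK".
    Returns a dict mapping interval index to the number of NAKs seen in it.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    counts = Counter()
    for timestamp_s, kind in frames:
        if kind == "NAK":
            counts[int(timestamp_s // interval_s)] += 1
    return dict(counts)


def flag_intervals(counts, threshold):
    """Return the interval indexes whose NAK count exceeds the threshold."""
    return sorted(i for i, n in counts.items() if n > threshold)


if __name__ == "__main__":
    # Invented frame log: timestamps in seconds from the start of monitoring.
    log = [(0.5, "ACK"), (1.0, "NAK"), (2.0, "NAK"), (61.0, "NAK"),
           (65.0, "ACK"), (125.0, "NAK"), (126.0, "NAK"), (127.0, "NAK")]
    per_minute = nak_counts(log, interval_s=60)
    print(per_minute)
    print(flag_intervals(per_minute, threshold=2))
```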

Performance  studies  are  best  conducted  by  dedicated  teams. 
SPLICE  sites  might  well  be  too  small  to  justify  the  investment 
in  manpower  and  resources  necessary  to  conduct  an  ongoing 
performance  study  program.   However,  it  might  be  practical  at 
a  higher  level.

A  network  control  center  is  the  logical  site  for 
centralizing  the  performance  evaluation  functions  discussed  in 
this  research.   The  network  control  center  has  other  functions 
as  well,  principally  involving  control  of  the  network  and  host 
resources.   A  SPLICE  NCC  could  conceivably  combine  network 
control  functions  through  NETEX,  wide  area  network  traffic 
control  through  NETMON,  and  terminal  monitoring  through  XRAY. 

We  offer  the  following  recommendations  for  future 
consideration:

-A  formal  performance  evaluation  effort  may  be  of  particular 
benefit  to  SPLICE  sites  due  to  the  dynamic  schedule  of 
interactive  applications  foreseen  by  network  planners.
Changing  workloads  and  an  expanded  user  base  are  conditions 
that  could  complicate  planning  estimates.   Performance 
evaluation  can  be  a  valuable  tool  for  validating  system 
design.

-Act  early  to  implement  CPE.   Don't  wait  until  it's  needed. 

-The  performance  evaluation  efforts  currently  conducted  on
existing  UADPS-SP  hosts  might  be  integrated  with  the  CPE
program  proposed  here.   It  might  help  cost-justify  a  local
performance  study  team. 

-Local  SPLICE  sites  should  consider  implementing  a  Network 
Control  Center  to  centralize  the  control  and  monitoring 
functions  cited  above. 

-Local  SPLICE  sites  should  be  encouraged  to  conduct
continuous  monitoring  activities  on  their  own  initiative, 
to  build  a  local  CPE  data  base,  and  to  formulate  their  own 
performance  standards. 

-Monitoring  of  wide  area  network  traffic  should  be  included 
as  a  function  of  the  SPLICE  NCC.   But  note  that  this 
capability  depends  on  implementation  of  interconnection  via 
Tandem  software  (EXPAND).   If  full  DDN  integration  is 
anticipated,  other  means  must  be  found  to  monitor  the 
gateway. 

-Fleet  Material  Support  Office  should  consider  formally
constituting  performance  study  teams  to  perform  periodic 
system  and  software  tuning  at  SPLICE  sites.   This  means 
more  than  ensuring  that  a  newly  installed  application  runs. 
It  would  mean  frequent  balancing  of  total  system  resources 
to  increase  the  value  of  the  SPLICE  dollar  to  the  Navy. 

-Dynamic  Load  Balancing  and  a  Resource  Eligibility  Listing
Service  are  low-effort  methods  of  improving  on-line  system
performance.

APPENDIX  A
GLOSSARY  OF  PERFORMANCE  EVALUATION  TERMS 


-ACK 

A  HYPERchannel  communications  subnetwork  term  referring  to
the  response  frame  transmitted  by  a  receiving  adapter  when  a 
data  frame  has  been  received  correctly. 

-Availability

Ratio  of  the  time  during  which  the  network  is  working  to  the
time  when  it  is  not.   [Ref.  8:p.  122]

-Baseband 

The  system  whereby  digitally  encoded  information  is  directly 
connected  to  the  transmission  medium  without  being 
modulated.  [Ref.  23:p.  177] 

-Bottleneck 

A  limitation  of  system  performance  due  to  the  inadequacy  of 
a  hardware  or  software  component  or  of  the  system's 
organization.  [Ref.  3:p.  241]

-Bus 

One  or  more  conductors  used  for  transmitting  signals  or 
power.   In  an  LAN,  a  bus  usually  is  in  a  broadcast 
transmission  mode.  [Ref.  23:p.  178] 

-Capacity

The  maximum  theoretical  value  that  the  throughput  of  a
system  can  reach.   [Ref.  3:p.  12]

-Carrier  Sense  Multiple  Access  (CSMA) 

A  contention  algorithm  for  a  bus  LAN  whereby  a  station 
wishing  to  transmit  determines  first  whether  another 
transmission  is  in  progress  by  sensing  if  a  carrier  is 
present.  [Ref.  23:p.  179]

-Concentrator 

A  communications  device  that  provides  communications 
capability  between  many  low  speed,  usually  asynchronous 
channels  and  one  or  more  high  speed,  usually  synchronous 
channels.   Generally,  different  speeds,  codes,  and  protocols 
can  be  accommodated  on  the  low  speed  side.  [Ref.  23:p.  182]

-Continuous  Monitoring  Activity

An  activity  performed  for  a  substantial  portion  of  the
lifetime  of  an  existing,  running  system.   Its  objective  is
to  keep  the  system's  performance  under  observation  in  order
to  detect  performance  problems  as  soon  as  they  arise.  [Ref.
9:pp.  9-10]

-Evaluation  Study 

An  activity  generally  limited  in  time  which  is  usually 
triggered  by  the  identification  of  a  performance  problem  or 
the  suspicion  of  its  presence.   [Ref.  9:p.  10]

-Event 

A  change  of  state  in  some  component  of  a  system.
[Ref.  3:p.  148]

-Expected  Message  Delay 

The  average  delay  experienced  by  a  message  following  its 
arrival  to  a  node  until  its  successful  transmission, 
assuming  a  homogeneous  set  of  nodes,  that  is,  with  the 
arrival  process  of  all  nodes  being  governed  by  the  same 
exponential  distribution,  and  a  single  distribution 
governing  the  lengths  of  all  messages.  [Ref.  15:p.  169]

-Frame 

In  bit-oriented  protocols,  the  vehicle  for  every  command, 
each  response,  and  all  information  that  is  transmitted. 
[Ref.  23:p.  189]

-Front-end  computer 

A  communications  computer  associated  with  a  host  computer. 
It  may  perform  line  control,  message  handling,  code 
conversion,  error  control,  and  applications  functions  such 
as  control  and  operation  of  special  purpose  terminals.  [Ref.
23:p.  190]

-Gateway 

A  node  common  to  two  or  more  networks  through  which  data 
flows  from  network  to  network.  The  gateway  may  reformat  the 
data  as  necessary  and  also  may  participate  in  error  and  flow 
control  protocols.  Used  to  connect  LANs  employing  different 
protocols  and  to  connect  LANs  to  public  data  networks.  [Ref.
23:p.  190]

-Input  Load 

The  rate  of  data  generated  by  the  stations  attached  to  the
local  network.   [Ref.  2:p.  236]

-Local  Area  Network  (LAN)

A  communications  system  whose  dimensions  typically  are  less
than  five  kilometers.   Transmissions  within  an  LAN  generally
are  digital,  carrying  data  among  stations  at  rates  usually
above  one  megabit  per  second.  [Ref.  23:p.  193]

-LAN  Management 

The  effort  that  encompasses  those  tasks,  human  and 
automated,  that  support  the  creation,  operation,  and 
evolution  of  a  network.  [Ref.  2:p.  354]

-Loopback  Test 

A  test  in  which  signals  are  looped  from  a  test  center 
through  a  data  set  or  loopback  switch  and  back  to  the  test 
center  for  measurement.  [Ref.  23:p.  194]

-Model 

A  representation  of  a  system  which  consists  of  a  certain 
amount  of  organized  information  about  it  and  is  built  for 
the  purpose  of  studying  it.  [Ref.  9:p.  19]

-Monitor 

A  mechanism  for  collecting  information  on  a  system's
activity.   [Ref.  3:p.  149]

-Multiplexing

The  support  on  a  single  physical  link  of  two  or  more  logical 
links.   A  device  which  performs  this  function  is  called  a 
multiplexor.   Unlike  a  concentrator,  a  multiplexor  is  not 
programmable.  [Ref.  23:p.  195]

-NAK 

A  HYPERchannel  communications  subnetwork  term  referring  to 
the  response  frame  transmitted  by  a  receiving  adapter  when  a 
data  frame  has  not  been  correctly  received. 

-Network  delay 

The  time  required  for  a  message  to  be  transmitted  from  a
source  and  accepted  at  the  designated  sink.   [Ref.  24:p.  45]

-Node 

A  point  where  one  or  more  functional  units  interconnect  data 
transmission  lines.   Distributed  system  nodes  include 
information  processors,  network  processors,  terminal 
controllers,  and  terminals.  [Ref.  23:p.  196] 

-No  Polls 

Cessation  of  a  polling  routine  indicating  failure  of  a  host,
front-end  processor,  or  modem.

-Offered  Load

The  total  rate  of  data  presented  to  the  network  for
transmission.   [Ref.  2:p.  236]

-Packet 

A  group  of  binary  digits,  including  data  and  call  control 
signals,  switched  as  a  composite  whole.   The  data,  all 
control  signals  and  possible  error  control  information,  are 
arranged  in  a  specific  format.  [Ref.  23:p.  197]

-Performance  Index 

A  descriptor  which  is  used  to  represent  a  system's
performance  or  some  of  its  aspects.   [Ref.  9:p.  11]

-Point-to-Point  Connection

A  connection  established  between  only  two  data  stations  for
data  transmission.  [Ref.  23:p.  198]

-Polling

Interrogation  of  devices  to  avoid  contention,  determine 
operation  status  or  determine  readiness  to  send  or  receive 
data.   In  data  communications,  the  process  of  inviting  data
stations  to  transmit,  one  at  a  time.  [Ref.  23:p.  198]

-Program  Tuning 

The  process  of  optimizing  a  computer  system's  workload  by 
optimizing  individual  programs  and  the  mix  of  programs
executed  on  a  host.   [Ref.  3:p.  309]

-Productivity

The  volume  of  information  processed  by  the  system  in  the
unit  time.   [Ref.  9:p.  13]

-Query 

In  interactive  systems,  an  operation  at  a  terminal  that
elicits  a  response  from  the  system.  [Ref.  23:p.  199]

-Reliability

The  extent  to  which  the  network  performs  in  the  expected
manner.   [Ref.  8:p.  122]

-Remote  Job  Entry  (RJE) 

Submission  of  jobs  through  an  input  unit  that  has  access  to
a  computer  through  a  data  link.  [Ref.  23:p.  199]

-Responsiveness
(See  Response  Time) 

-Response  time

The  time  interval  between  the  instant  the  inputting  of  a
command  to  an  interactive  system  terminates  and  the  instant
the  corresponding  reply  begins  to  appear  at  the  terminal.
[Ref.  3:p.  11]

-Synthetic  Test  Workload 

May  consist  either  of  a  subset  of  the  basic  components  of 
the  real  workload  (a  natural  synthetic  workload),  or  of  a 
mixture  of  real  workload  components  and  purposely 
constructed  components  (a  hybrid  synthetic  workload).  [Ref.
3:p.  53]

-System  Tuning 

The  adjustment  of  a  system's  parameters  to  adapt  it  to  the
workload  of  an  installation.   [Ref.  9:p.  18]

-Throughput 

The  amount  of  work  performed  by  a  system  in  a  given  amount
of  time.   [Ref.  3:p.  11]

-Timeouts 

Failure  of  a  control  unit  or  terminals,  manifested  as  no
response  to  a  poll.

-Trace 

A  sequence  of  events  in  chronological  order.   [Ref.  9:p.  30]

-Utilization

The  ratio  between  the  time  a  specified  part  of  the  system  is
used  (or  used  for  some  specified  purposes)  during  a  given
interval  of  time  and  the  duration  of  that  interval.  [Ref.
9:p.  13]

-Virtual  Circuit 

A  communications  path  established  by  computerized  switching. 
The  virtual  circuit  exists  only  while  it  is  carrying  data.
[Ref.  23:p.  205]

-Wide  Area  Network 

A  data  communications  network  designed  to  serve  an  area  of
hundreds  or  thousands  of  miles.   [Ref.  23:p.  205]